Abstract



The paper proposes a general notion of interaction between attributes, which
can be applied to many fields in decision making and data analysis. It generalizes
the notion of interaction defined for criteria modelled by capacities, by considering
functions defined on lattices. For a given problem, the lattice contains for each
attribute the partially ordered set of remarkable points or levels. The interaction
is based on the notion of derivative of a function defined on a lattice, and appears
as a generalization of the Shapley value or other probabilistic values.

1 The concept of interaction: an introduction



Let us consider a set $N$ of criteria describing the preferences of a decision maker (DM)
over a set $X$ of objects, alternatives, etc. We assume that for any object $x \in X$, we are
able to build a vector of scores $(a_1, \ldots, a_n)$ describing the satisfaction of the DM for $x$
w.r.t. each criterion. For this reason, and in order to remain at an abstract level, we call
this vector a tuple, which we identify with the object or alternative. We may suppose
for the moment that scores are given on the real interval $[0, 1]$, with 0 and 1 having the
meaning of "unacceptable" and "totally satisfying" respectively.

We make the simplifying assumption that the preference of the DM is solely determined
by binary tuples, i.e. those whose scores are either 0 or 1 on each criterion, the preference
for other tuples being more or less an interpolation between binary tuples. More precisely,
denoting by $(1_A, 0_{A^c})$ the binary tuple having a score of 1 for all criteria in $A \subseteq N$, and
0 elsewhere, this amounts to assigning an overall score $v(A)$ in $[0, 1]$ to $(1_A, 0_{A^c})$. Doing
this for all $A \subseteq N$, we have defined a set function $v : 2^N \to [0, 1]$.

Although this is not essential in the sequel, we may impose some natural properties
on $v$. First, we may set $v(\emptyset) := 0$ and $v(N) := 1$, since $A = \emptyset$ (resp. $N$) corresponds to
a binary tuple having all its scores equal to 0 (resp. 1). Second, considering $A \subseteq B$,
this leads to two binary tuples of which one dominates the other, in the sense that on
each criterion one is at least as good as the other. Then it seems natural to impose
$v(A) \le v(B)$. This is called isotonicity. A set function $v$ satisfying these two conditions
is called a capacity [3] (also called fuzzy measure [20]).

Let us now consider the case $n = 2$ in some detail. There are 4 binary tuples (0,0),
(0,1), (1,0) and (1,1), and we know already that the first and last have overall scores
$0 = v(\emptyset)$ and $1 = v(N)$. What about the 2 remaining ones? There are two extreme
situations, under isotonicity.

• $v(\{1\}) = v(\{2\}) = 0$, which means that for the DM, both criteria have to be
satisfactory in order to get a satisfactory tuple, the satisfaction of only one criterion
being useless. We say that the criteria are complementary.

• $v(\{1\}) = v(\{2\}) = 1$, which means that for the DM, the satisfaction of one of the
two criteria is sufficient to have a satisfactory tuple, satisfying both being useless.
We say that the criteria are substitutive.

Clearly, in these two situations, the criteria are not independent, in the sense that the
satisfaction of one of them acts on the usefulness of the other in order to get a satisfactory
tuple (necessary in the first case, useless in the second). So we may say that there is some
interaction between the criteria¹.

What should be a situation where no interaction occurs, i.e. criteria act independently?
It is a situation where the satisfaction of each criterion brings its own contribution
to the overall satisfaction, hence:

$$v(\{1,2\}) = v(\{1\}) + v(\{2\}).$$

Note that in the first situation, $v(\{1,2\}) > v(\{1\}) + v(\{2\})$, while the reverse inequality
holds in the second situation. This suggests that the interaction $I_{12}$ between criteria 1
and 2 should be defined as:

$$I_{12} := v(\{1,2\}) - v(\{1\}) - v(\{2\}) + v(\emptyset). \qquad (1)$$

This is simply the difference between binary tuples on the diagonal (where there is strict 
dominance) and on the anti-diagonal (where there is no dominance relation). The in- 
teraction is positive when criteria are complementary, while it is negative when they are 
substitutive. This is consistent with intuition considering that when criteria are comple- 
mentary, they have no value by themselves, but put together they become important for 
the DM. 
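As a minimal illustration of Eq. (1), the sketch below evaluates $I_{12}$ on the two extreme capacities described above and on an additive one. The dictionary encoding of $v$, the helper names `capacity` and `interaction_2`, and the additive scores 0.3 and 0.7 are assumptions made for this example only.

```python
def capacity(v1, v2):
    """Build a 2-criteria capacity from v({1}) and v({2}).

    v(empty) = 0 and v({1,2}) = 1 are fixed, as in the text.
    """
    return {
        frozenset(): 0.0,
        frozenset({1}): v1,
        frozenset({2}): v2,
        frozenset({1, 2}): 1.0,
    }


def interaction_2(v):
    """Eq. (1): I_12 = v({1,2}) - v({1}) - v({2}) + v(empty)."""
    return (v[frozenset({1, 2})] - v[frozenset({1})]
            - v[frozenset({2})] + v[frozenset()])


complementary = capacity(0.0, 0.0)   # both criteria needed
substitutive = capacity(1.0, 1.0)    # one criterion suffices
additive = capacity(0.3, 0.7)        # hypothetical independent criteria

for name, v in [("complementary", complementary),
                ("substitutive", substitutive),
                ("additive", additive)]:
    print(name, interaction_2(v))
```

The sign of the result matches the discussion: positive for complementary criteria, negative for substitutive ones, and (up to rounding) zero when $v$ is additive.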

In the case of more than 2 criteria, the definition of interaction is trickier but
follows the same idea (see below). In fact, when $n > 2$, we may define the interaction
between $3, 4, \ldots, n$ criteria as well. The general definition of interaction for capacities
has been given in [8], and has been axiomatized in [12].

¹ For further discussion on substitutive and complementary criteria, see Marichal [16].

The above story for introducing interaction can be made considerably more general. Let
us first take the interval $[-1, 1]$ instead of $[0, 1]$ for expressing scores, and consider that
for the DM, values $-1$, 0 and 1 are particular because they express respectively total
unsatisfaction, neutrality and total satisfaction. Then we are led to consider ternary
tuples $(1_A, -1_B, 0_{(A \cup B)^c})$, whose overall score is denoted by $v(A, B)$. It is convenient to
denote $\mathcal{Q}(N) := \{(A, B) \mid A, B \subseteq N, A \cap B = \emptyset\}$. Now $v$ is defined on $\mathcal{Q}(N)$, and as
for capacities, it seems natural to impose $v(N, \emptyset) := 1$, $v(\emptyset, \emptyset) := 0$, and $v(\emptyset, N) := -1$.
Also using the dominance argument, we should have, if $A \subseteq A'$, $v(A, B) \le v(A', B)$
and $v(B, A) \ge v(B, A')$. Such a $v$ is called a bi-capacity fiUl [9]. The interaction for
bi-capacities, called bi-interaction in [9], has been defined accordingly, and follows the
same principle. When $n = 2$, since we have 3 particular levels $-1$, 0 and 1, the square
$[-1, 1]^2$ is divided into 4 small squares and has 9 ternary tuples. In each small square, we
apply the same definition as with capacities, i.e. Eq. (1). Hence, we have four interaction
indices to describe interaction with $n = 2$, namely (see Figure 1):



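$$
\begin{aligned}
I_{\{1,2\},\emptyset} &:= v(\{1,2\},\emptyset) - v(\{2\},\emptyset) - v(\{1\},\emptyset) + v(\emptyset,\emptyset) &&= I(\{1,2\},\emptyset) \\
I_{\emptyset,\{1,2\}} &:= v(\emptyset,\emptyset) - v(\emptyset,\{1\}) - v(\emptyset,\{2\}) + v(\emptyset,\{1,2\}) &&= I(\emptyset,\emptyset) \\
I_{\{1\},\{2\}} &:= v(\{1\},\emptyset) - v(\emptyset,\emptyset) - v(\{1\},\{2\}) + v(\emptyset,\{2\}) &&= I(\{1\},\emptyset) \\
I_{\{2\},\{1\}} &:= v(\{2\},\emptyset) - v(\{2\},\{1\}) - v(\emptyset,\emptyset) + v(\emptyset,\{1\}) &&= I(\{2\},\emptyset).
\end{aligned}
\qquad (2)
$$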



The notation $I_{A,B}$ means that criteria in $A$ are positive, while criteria in $B$ are negative.

[Figure: the nine ternary tuples placed on the square $[-1, 1]^2$, each point labelled with its score $v(A, B)$.]

Figure 1: Ternary tuples when $n = 2$

As will become clear later, a better notation is $I(A, B)$, where $(A, B)$ is the ternary
tuple corresponding to the upper right corner of the square under consideration (i.e. the
best possible tuple in the square).
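The four indices of Eq. (2) can be computed uniformly by indexing each small square by its upper right corner, as suggested by the notation $I(A, B)$. The sketch below is only an illustration: the encoding of ternary tuples as pairs in $\{-1, 0, 1\}^2$ and the two example bi-capacities (`v_min`, the minimum of the two scores, and `v_add`, their average) are assumptions, not taken from the paper.

```python
from itertools import product

LEVELS = (-1, 0, 1)


def square_interaction(v, corner):
    """Interaction in the small square whose upper right corner is `corner`.

    Same pattern as Eq. (1): diagonal minus anti-diagonal.
    """
    a, b = corner
    return v[(a, b)] - v[(a - 1, b)] - v[(a, b - 1)] + v[(a - 1, b - 1)]


# Hypothetical bi-capacities on the 9 ternary tuples (score of criterion 1,
# score of criterion 2); both give 1 at (1, 1), 0 at (0, 0), -1 at (-1, -1).
v_min = {t: float(min(t)) for t in product(LEVELS, repeat=2)}
v_add = {t: sum(t) / 2 for t in product(LEVELS, repeat=2)}

# Corners (1,1), (0,0), (1,0), (0,1) correspond to
# I({1,2},0), I(0,0), I({1},0), I({2},0) in Eq. (2).
CORNERS = [(1, 1), (0, 0), (1, 0), (0, 1)]

bi_min = {c: square_interaction(v_min, c) for c in CORNERS}
bi_add = {c: square_interaction(v_add, c) for c in CORNERS}
print(bi_min)
print(bi_add)
```

For the additive example every index vanishes, while the minimum shows positive interaction only in the two squares lying on the main diagonal.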

Let us now take a general point of view. We consider $n$-dimensional tuples in
$X := X_1 \times \cdots \times X_n$, where it is assumed that each $X_i$ is a partially ordered set, whose order
relation is denoted by $\le_i$. We consider that on each dimension $X_i$, there exist reference
levels which, for the problem under consideration, convey some special meaning
of interest, describing e.g. some particular situation, and that these reference levels form
a lower locally distributive lattice $(L_i, \le_i)$. Denoting by $L := L_1 \times \cdots \times L_n$ the product
lattice with the product order, we define a real function $v : L \to \mathbb{R}$, assigning a real
value to any combination of reference levels on each dimension.



Let us give some instances of this general framework. 

voting games and ternary voting games: defining $N := \{1, \ldots, n\}$ as the set of voters,
for each voter there exist two or three reference levels, which are: voting in favor,
voting against (case of classical voting games), and abstention (case of ternary
voting games [6]). For classical games, we have $L_i = \{0, 1\}$, $\forall i \in N$, with level 1
corresponding to voting in favor, so that $L = 2^n$, and $v(A) = 1$ if the bill is accepted
when $A$ is the set of voters voting in favor, or $v(A) = 0$ if the bill is rejected. For
ternary voting games, we have $L_i = \{-1, 0, 1\}$, with 0 corresponding to abstention
and $-1$ to voting against. Then $L = 3^n$, and it is convenient to denote an element
of $L$ by a pair $(A, B)$, where $A$ is the set of voters in favor, and $B$ the set of voters
against. As before, $v(A, B) = 1$ (the bill is accepted) or 0 (the bill is rejected).
Note that here $X_i$ coincides with $L_i$, $\forall i \in N$.

cooperative games and bi-cooperative games: we replace voters by players. Reference
levels in the case of cooperative games are 0 and 1, corresponding to non
participation and participation in the game. Hence $L = 2^n$, and $v(A)$ is the asset
that the coalition $A$ of players will win if the game is played. For bi-cooperative
games, $L = 3^n$, and $v(A, B)$ is the asset that $A$ will receive when coalition $A$ plays
against coalition $B$, the remaining players not taking part in the game. Classically,
here also $X_i$ coincides with $L_i$, although one may consider any degree of participation
between full participation and non participation (fuzzy games), which leads to
$X_i = [0, 1]$.

multicriteria decision making: this corresponds to the framework given in the
introduction. We have $L_i = \{0, 1\}$ for all $i \in N$ if we consider only two reference
levels "unacceptable" and "totally satisfying", which leads to capacities, and
$L_i = \{-1, 0, 1\}$ if a neutral level is added, which leads to bi-capacities, as
explained above. Let us remark that our general framework allows one to be much
more general: one may have more than 3 levels, adding for example intermediate
levels such as "half satisfactory", etc., or even introduce non comparable levels,
provided the lattice structure is preserved. For example, the level "don't know"
may be incomparable with "neutral", but smaller than "satisfactory" and greater
than "unsatisfactory", thus leading to the lattice $2^2$:

              satisfactory
             /            \
       neutral            don't know
             \            /
             unsatisfactory

In addition, we may consider different $L_i$ for each criterion. The function $v$ defines
the overall score given to a tuple having various reference levels on criteria.

data analysis: the construction is the same as for multicriteria decision making, but
the meaning conveyed by the dimensions and the reference levels can be much
more general, depending on the kind of data, being for example "high", "medium",
"low", etc. We do not even need to have numerical dimensions, so that ordinal data
analysis can be done. The meaning of $v(x)$ for $x \in L$ depends on the aim of the
analysis. We propose three main examples:

• evaluation of $x$. For example, $x$ is some kind of prototypical product, and
a user or consumer gives an evaluation of it, which defines $v(x)$ (subjective
evaluation).

• classification in some category. $v(x)$ is the label of the category, or takes value
0 or 1 (does not belong or belongs to a given category: in this latter case we
need as many functions $v$ as the number of categories) (pattern recognition).

• the number of items identical or similar to $x$ in the data set (data mining).
Suppose we have a large set $D$ of data with some distance defined on it. $x \in L$
defines a particular prototypical datum. Then $v(x)$ is the cardinality of the set
of data $x' \in D$ within a given distance of $x$, or $v(x)$ is the sum of the inverse
distances from any $x' \in D$ to $x$.
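The two data-mining choices of $v$ in the last item can be sketched as follows. This is a hedged illustration only: the Euclidean distance, the toy data set, the radius, and the decision to skip data points coinciding with the prototype in the inverse-distance sum are all assumptions that the text does not specify.

```python
import math


def count_within(prototype, data, radius):
    """v(x) as the number of data points within `radius` of the prototype."""
    return sum(1 for point in data if math.dist(prototype, point) <= radius)


def inverse_distance_sum(prototype, data):
    """v(x) as the sum of inverse distances to the prototype.

    Assumption: points at distance 0 are skipped to avoid dividing by zero.
    """
    distances = (math.dist(prototype, point) for point in data)
    return sum(1.0 / d for d in distances if d > 0)


# Hypothetical 2-dimensional data set and prototype.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
prototype = (0.0, 0.0)

print(count_within(prototype, data, radius=2.0))
print(inverse_distance_sum(prototype, data))
```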

We propose in this paper a general definition of interaction, which can be applied to 
the above defined framework, and encompasses already existing definitions of interaction 
for capacities and bi-capacities. The precise meaning of interaction is governed by the 
meaning of the function $v$. In game theory, it describes the synergy between players or
voters, and the interest in forming or not forming certain coalitions. In multicriteria decision
making, it tells which criteria play a key role (and how), and which criteria are redundant (with
which one) in the decision process. In data mining, when $v$ is a counting function as above,
the interaction has a statistical flavor close to correlation. Indeed, since the interaction
index is roughly speaking a difference between the diagonal and the anti-diagonal, a positive (resp.
negative) interaction corresponds to a positive (resp. negative) correlation. In pattern
recognition, interaction is very informative for feature selection (see an application of 
interaction in this topic in [T|). 

Clearly, the interaction is a key concept in knowledge discovery, and has a strong 
descriptive power. We detail its construction and properties in the sequel, after recalling 
classical results. 

For simplicity, the cardinality of sets $A, B, S, \ldots$ will be denoted by the corresponding
lower case $a, b, s, \ldots$, and we will often omit braces for singletons. We put $N := \{1, \ldots, n\}$.

2 Importance and interaction indices for $L = 2^n$ and $L = 3^n$

We recall in this section the classical definition for $L = 2^n$ (which corresponds to
capacities, or more generally set functions, pseudo-Boolean functions [E]), and the one for
$L = 3^n$ (bi-capacities, bi-cooperative games).

Let $v : 2^N \to \mathbb{R}$, with $v(\emptyset) = 0$ (game). As will become clear, the interaction index
is a generalization of the power index or importance index $\phi^v(i)$, $i \in N$, which expresses
to what extent an element $i \in N$ (attribute, dimension) has importance or power for the
problem under consideration. The general form is:

$$\phi^v(i) = \sum_{S \subseteq N \setminus i} \alpha^1_s \, [v(S \cup i) - v(S)], \qquad (3)$$

with $\alpha^1_s \in \mathbb{R}$. The value of the coefficients $\alpha^1_s$ has to be determined by additional requirements.
The most important example is the Shapley index [19], where

$$\alpha^1_s = \frac{(n-s-1)!\, s!}{n!}, \quad s = 0, \ldots, n-1, \qquad (4)$$

obtained by the following property: $\sum_{i=1}^n \phi^v(i) = v(N)$, expressing a sharing of the total
value among all elements, according to their importance (efficiency axiom). Another
classical example is the Banzhaf index P^, where $\alpha^1_s = \frac{1}{2^{n-1}}$, $s = 0, \ldots, n-1$.
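The general form (3) and its two classical instances can be checked numerically on a small game. The sketch below is an illustration under stated assumptions: the function names, the representation of a game as a Python callable on frozensets, and the weighted quadratic game `v` are hypothetical choices made here.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def importance(v, players, i, alpha):
    """Eq. (3): weighted sum of the marginal contributions of element i."""
    n = len(players)
    others = [p for p in players if p != i]
    return sum(alpha(n, len(S)) * (v(S | {i}) - v(S)) for S in subsets(others))


def shapley_alpha(n, s):
    """Eq. (4): Shapley coefficients."""
    return factorial(n - s - 1) * factorial(s) / factorial(n)


def banzhaf_alpha(n, s):
    """Banzhaf coefficients: uniform weight over the 2^(n-1) coalitions."""
    return 1 / 2 ** (n - 1)


WEIGHTS = {1: 1, 2: 2, 3: 3}  # hypothetical player weights


def v(S):
    """Hypothetical game: squared total weight of the coalition."""
    return sum(WEIGHTS[p] for p in S) ** 2


players = sorted(WEIGHTS)
shapley = {i: importance(v, players, i, shapley_alpha) for i in players}
banzhaf = {i: importance(v, players, i, banzhaf_alpha) for i in players}
print(shapley, banzhaf)
```

For this particular game the Shapley values sum to $v(N) = 36$, as required by the efficiency axiom; the Banzhaf index does not satisfy efficiency in general, even though the two indices happen to coincide on this quadratic example.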

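The interaction index [8] expresses the interaction among a coalition (group) $S \subseteq N$
of elements:

$$I^v(S) = \sum_{T \subseteq N \setminus S} \alpha^s_t \, \Delta_S v(T), \qquad (5)$$

where $\alpha^s_t \in \mathbb{R}$, and $\Delta_S v(T)$ is the derivative of $v$ w.r.t. $S$ at $T$ for $S \subseteq N \setminus T$, defined
recursively as follows:

$$
\begin{aligned}
\Delta_\emptyset v(T) &:= v(T) \\
\Delta_i v(T) &:= v(T \cup i) - v(T) \\
\Delta_S v(T) &:= \Delta_i(\Delta_{S \setminus i} v(T)), \quad |S| > 1.
\end{aligned}
$$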



Observe that $I^v(\{i\}) = \phi^v(i)$, hence an interaction index is a generalization of an
importance index. It is possible to define recursively the interaction index from the importance
index [12]. Then, choosing a particular importance index (hence the coefficients $\alpha^1_s$)
defines uniquely the coefficients $\alpha^s_t$. Let us introduce some notations, borrowed from game
theory. The restricted game $v^{N \setminus K}$ is the game $v$ restricted to elements (players) in $N \setminus K$,
hence $v^{N \setminus K}(S) = v(S)$ for any $S \subseteq N \setminus K$, and is not defined outside. The reduced game
$v_{[K]}$ is the game where all elements in $K$ are considered as a single element denoted by
$[K]$, i.e. the set of elements is then $N_{[K]} := (N \setminus K) \cup \{[K]\}$. The reduced game is defined
by, for any $S \subseteq N \setminus K$:

$$
\begin{aligned}
v_{[K]}(S) &= v(S) \\
v_{[K]}(S \cup \{[K]\}) &= v(S \cup K).
\end{aligned}
$$
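The recursion axiom reads

$$I^v(S) = I^{v_{[S]}}([S]) - \sum_{K \subseteq S,\ K \ne \emptyset, S} I^{v^{N \setminus K}}(S \setminus K). \qquad (6)$$

Its meaning is simple when $|S| = 2$. Indeed, for $S = \{i, j\}$ the formula can be written as

$$I^{v_{[ij]}}([ij]) = I^{v^{N \setminus j}}(i) + I^{v^{N \setminus i}}(j) + I^v(\{i, j\}).$$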

It means that the importance of elements (e.g. players) $i, j$ taken together is the sum of
individual importances when the other is absent, and the interaction they have between
them. Hence a positive interaction means that the overall importance of $i, j$ is greater
than the sum of their respective marginal importances (see [12] for another equivalent
axiom).

This axiom leads to the following formula for $\alpha^t_s(n)$, the argument indicating the
number of players in the game:

$$\alpha^t_s(n) = \alpha^1_s(n - t + 1), \quad \forall s = 0, \ldots, n - t, \ \forall t = 1, \ldots, n - 1. \qquad (7)$$

When $\phi^v$ is the Shapley index, we obtain the Shapley interaction index, whose coefficients
are, using (7) and exchanging the roles of $s$ and $t$ so that $s = |S|$ and $t = |T|$ as in (5):

$$\alpha^s_t = \frac{(n-s-t)!\, t!}{(n-s+1)!}.$$
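To make the recursive derivative and the Shapley interaction coefficients concrete, here is a small sketch. It assumes a game given as a Python callable on frozensets; the function names and the example game (a unanimity game on $\{1, 2\}$ with three players) are illustrative choices, not taken from the paper.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def derivative(v, S, T):
    """Recursive derivative: Delta_S v(T) = Delta_i(Delta_{S minus i} v)(T)."""
    S, T = frozenset(S), frozenset(T)
    if not S:
        return v(T)
    i = next(iter(S))
    rest = S - {i}
    return derivative(v, rest, T | {i}) - derivative(v, rest, T)


def shapley_interaction(v, players, S):
    """Eq. (5) with coefficients (n-s-t)! t! / (n-s+1)!."""
    n, s = len(players), len(S)
    outside = [p for p in players if p not in S]
    total = 0.0
    for T in subsets(outside):
        t = len(T)
        coeff = factorial(n - s - t) * factorial(t) / factorial(n - s + 1)
        total += coeff * derivative(v, S, T)
    return total


players = [1, 2, 3]


def v(S):
    """Hypothetical unanimity game: worth 1 iff players 1 and 2 are both present."""
    return 1.0 if {1, 2} <= set(S) else 0.0


print(shapley_interaction(v, players, {1, 2}))  # players 1 and 2 are complementary
print(shapley_interaction(v, players, {1}))     # importance (Shapley value) of player 1
print(shapley_interaction(v, players, {1, 3}))  # player 3 is a dummy
```

With $|S| = 1$ the function reduces to the Shapley index, in line with the remark that $I^v(\{i\}) = \phi^v(i)$.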

We have generalized the above notions to the case of bi-capacities and bi-cooperative
games [9|, HI], and given an axiomatization pTl IT5]. As explained in Section 1, we have
to consider all combinations between positive and negative parts of the $X_i$'s (see Eq.
(2)), and following the notation introduced there, we denote by $I_{S,T}$, $(S, T) \in \mathcal{Q}(N)$, the
interaction among elements when $S$ is the set of positive elements, and $T$ is the set of
negative elements. The Shapley index divides into two indices $I_{i,\emptyset}$ and $I_{\emptyset,i}$, defined by:

$$I_{i,\emptyset} := \sum_{S \subseteq N \setminus i} \frac{(n-s-1)!\, s!}{n!} \, \Delta_{i,\emptyset} v(S, N \setminus (S \cup i)) \qquad (8)$$

$$I_{\emptyset,i} := \sum_{S \subseteq N \setminus i} \frac{(n-s-1)!\, s!}{n!} \, \Delta_{\emptyset,i} v(S, N \setminus S) \qquad (9)$$

where the derivatives are defined by:

$$
\begin{aligned}
\Delta_{i,\emptyset} v(S, T) &:= v(S \cup i, T) - v(S, T), \quad (S, T) \in \mathcal{Q}(N \setminus i) \\
\Delta_{\emptyset,i} v(S, T) &:= v(S, T \setminus i) - v(S, T), \quad (S, T) \in \mathcal{Q}(N),\ S \not\ni i,\ T \ni i.
\end{aligned}
$$

$\Delta_{i,\emptyset} v(S, T)$ is the contribution of element $i$ when it acts as a positive element, while
$\Delta_{\emptyset,i} v(S, T)$ is the (negative) contribution of $i$ when acting as a negative element. Hence
the two above Shapley values are average contributions of an element when it acts as a
positive or as a negative element.

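The coefficients are obtained through an efficiency axiom which reads:

$$\sum_{i \in N} [I_{i,\emptyset} + I_{\emptyset,i}] = v(N, \emptyset) - v(\emptyset, N).$$

As above, the derivative $\Delta_{S,T}$ can be defined recursively from these equations, and the
definition of the Shapley interaction index is:

$$I_{S,T} := \sum_{K \subseteq N \setminus (S \cup T)} \frac{(n-s-t-k)!\, k!}{(n-s-t+1)!} \, \Delta_{S,T} v(K, N \setminus (K \cup S)).$$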



3 Mathematical background and general framework 
for interaction 

We now try to give a general view of the previous definitions, thanks to results from lattice
theory. We first introduce necessary definitions (see e.g. [2llH [13]).

Let $(L, \le)$ be a lattice; we denote as usual by $\vee, \wedge, \top, \bot$ supremum, infimum, top and
bottom (if they exist). If $x$ and $y$ in $L$ are incomparable, we write $x \| y$. $Q \subseteq L$ is a
downset of $L$ if $x \in Q$ and $y \le x$ imply $y \in Q$. For any $x \in L$, the principal ideal $\downarrow x$ is
defined as $\downarrow x := \{y \in L \mid y \le x\}$ (downset generated by $x$). For $x, y \in L$, we say that
$x$ covers $y$ (or $y$ is a predecessor of $x$), denoted by $x \succ y$, if there is no $z \in L$, $z \ne x, y$,
such that $x \ge z \ge y$. $(L, \le)$ is lower semi-modular (resp. upper semi-modular) if for all
$x, y \in L$, $x \vee y \succ x$ and $x \vee y \succ y$ imply $x \succ x \wedge y$ and $y \succ x \wedge y$ (resp. $x \succ x \wedge y$ and
$y \succ x \wedge y$ imply $x \vee y \succ x$ and $x \vee y \succ y$). A lattice being upper and lower semi-modular is
called modular. A lattice is modular iff it does not contain $N_5$ as a sublattice (see Fig. 2).
A lattice is distributive when $\vee, \wedge$ satisfy the distributivity law, and it is complemented
when each $x \in L$ has a (unique) complement $x'$, i.e. satisfying $x \vee x' = \top$ and $x \wedge x' = \bot$.
A modular lattice is distributive iff it does not contain $M_3$ as a sublattice (see Fig. 2). A
lattice is linear if it is totally ordered. A lattice is said to be Boolean if it has a top and
bottom element, is distributive and complemented. When $L$ is finite, it is Boolean iff it
is isomorphic to the lattice $2^n$ for some $n$.





Figure 2: The lattices M3 (left) and N5 (right) 

$(L, \le)$ is said to be lower locally distributive if it is lower semi-modular, and it does
not contain a sublattice isomorphic to $M_3$. Equivalently, it is lower locally distributive if
for any $x \in L$, the interval $[\bigwedge_{y \prec x} y,\ x]$ is a Boolean lattice (see [IT] for a survey).

An element $i \in L$ is join-irreducible if it cannot be written as a supremum over other
elements of $L$ and it is not the bottom element. When $L$ is finite, this is equivalent to $i$
covering only one element. Let us call $\mathcal{J}(L)$ the set of all join-irreducible elements of $L$.

In a finite distributive lattice, any element $y \in L$ can be decomposed in terms of
join-irreducible elements. The fundamental result due to Birkhoff is the following.



Theorem 1 Let $L$ be a finite distributive lattice. Then the map $\eta : L \to \mathcal{O}(\mathcal{J}(L))$,
where $\mathcal{O}(\mathcal{J})$ is the set of all downsets of $\mathcal{J}$, defined by

$$\eta(x) := \{i \in \mathcal{J}(L) \mid i \le x\} = \mathcal{J}(L) \cap \downarrow x$$

is an isomorphism of $L$ onto $\mathcal{O}(\mathcal{J}(L))$.

We call $\eta(x)$ the normal decomposition of $x$; we have

$$x = \bigvee \eta(x).$$

The isomorphism says that $x \le y$ iff $\eta(x) \subseteq \eta(y)$, hence $\eta(x \vee y) = \eta(x) \cup \eta(y)$ and so on.

The decomposition of some $x$ in $L$ as a supremum of join-irreducible elements
is unique up to the fact that it may happen that some join-irreducible elements in $\eta(x)$
are comparable. Hence, if $i \le j$ and $j$ is in a decomposition of $x$, then we may delete $i$
from the decomposition of $x$. We call minimal decomposition the (unique) decomposition
of minimal cardinality, denoted by $\eta^*(x)$. Atoms are join-irreducible elements covering
$\bot$. A lattice is atomistic if all join-irreducible elements are atoms. A finite distributive
atomistic lattice is Boolean.

As shown by Dilworth [5], any $x \in L$ has a unique minimal decomposition into
join-irreducible elements iff $L$ is lower locally distributive.

A useful result is the following:

$$\downarrow x = \{y \mid \eta(y) = \bigcup_{j \in K} j,\ K \subseteq \eta(x)\}. \qquad (10)$$

When $L = 2^n$, join-irreducible elements are simply atoms (i.e. singletons of $N$). When
$L = 3^n$, join-irreducible elements are $(i, i^c)$ and $(\emptyset, i^c)$, $\forall i \in N$.
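The normal and minimal decompositions can be computed by brute force on a small example. The sketch below uses, as an assumed illustration not found in the paper, the lattice of divisors of 12 ordered by divisibility (a finite distributive lattice whose join is the least common multiple); all names are hypothetical.

```python
from functools import reduce
from math import gcd

ELEMENTS = [1, 2, 3, 4, 6, 12]  # divisors of 12, ordered by divisibility


def leq(a, b):
    """a <= b in the lattice iff a divides b."""
    return b % a == 0


def join(a, b):
    """Supremum = least common multiple."""
    return a * b // gcd(a, b)


def covers(x, y):
    """True iff x covers y: y < x with nothing strictly in between."""
    if x == y or not leq(y, x):
        return False
    return not any(z not in (x, y) and leq(y, z) and leq(z, x) for z in ELEMENTS)


# In a finite lattice, join-irreducible elements cover exactly one element.
JOIN_IRREDUCIBLES = [x for x in ELEMENTS
                     if sum(covers(x, y) for y in ELEMENTS) == 1]


def eta(x):
    """Normal decomposition: all join-irreducible elements below x."""
    return {i for i in JOIN_IRREDUCIBLES if leq(i, x)}


def eta_star(x):
    """Minimal decomposition: drop elements lying below another one."""
    dec = eta(x)
    return {i for i in dec if not any(i != j and leq(i, j) for j in dec)}


print(JOIN_IRREDUCIBLES, eta(12), eta_star(12))
```

Here 2 is deleted from the decomposition of 12 because $2 \le 4$, and taking the supremum of either decomposition recovers the original element.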

Let $(L, \le)$ be a locally finite partially ordered set, and $g : L \to \mathbb{R}$ a function.
Consider the following equation

$$g(x) = \sum_{y \le x} f(y). \qquad (11)$$

There is a unique solution $f : L \to \mathbb{R}$ to this equation, called the Möbius transform of
$g$ (see Rota pK]). Note that in a sense, $f$ could be considered as the derivative of $g$.
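On a finite poset, Eq. (11) can be solved by processing elements from the bottom up, since $f(x) = g(x) - \sum_{y < x} f(y)$. The following sketch assumes a finite poset given by a list of elements and an order predicate; the function name and the example (divisors of 12 with $g(d) = d$) are illustrative assumptions.

```python
def mobius_transform(elements, leq, g):
    """Solve g(x) = sum_{y <= x} f(y) for f on a finite poset.

    Elements are processed in a linear extension of the order (sorted by
    the size of their downset), so every y < x is handled before x.
    """
    order = sorted(elements, key=lambda x: sum(1 for y in elements if leq(y, x)))
    f = {}
    for x in order:
        # At this point f only contains elements processed earlier,
        # so the sum runs over the y strictly below x.
        f[x] = g[x] - sum(f[y] for y in list(f) if leq(y, x))
    return f


def divides(a, b):
    return b % a == 0


divisors = [1, 2, 3, 4, 6, 12]
g = {d: d for d in divisors}          # hypothetical function g(d) = d
f = mobius_transform(divisors, divides, g)
print(f)

# Sanity check of Eq. (11): summing f over each downset gives back g.
recovered = {x: sum(f[y] for y in divisors if divides(y, x)) for x in divisors}
```

For this choice of $g$ the transform is Euler's totient function, which is the classical number-theoretic instance of Möbius inversion.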

As said in the introduction, our general framework for the definition of interaction will
be to consider finite lower locally distributive lattices $L_1, \ldots, L_n$, with top and bottom of
$L_i$ denoted by $\top_i, \bot_i$, $i = 1, \ldots, n$, and the product lattice $L := L_1 \times \cdots \times L_n$ with the
product order. Sometimes, we will need in addition that the $L_i$'s are modular (hence they
are distributive). We set $N := \{1, \ldots, n\}$. A vertex of $L$ is an element $x = (x_1, \ldots, x_n)$
of $L$ where $x_i$ is either $\top_i$ or $\bot_i$, for $i = 1, \ldots, n$. We denote by $\Gamma(L)$ the set of vertices
of $L$. Note that if $L$ is a Boolean lattice, then $L = \Gamma(L)$. For $\mathcal{Q}(N)$, vertices are of the
form $(A, A^c)$, $A \subseteq N$.

Since $L_j$ is finite and lower locally distributive, it can be represented by join-irreducible
elements. Then join-irreducible elements of $L$ are simply of the form

$$i = (\bot_1, \ldots, \bot_{j-1}, i_0, \bot_{j+1}, \ldots, \bot_n),$$

for some $j \in \{1, \ldots, n\}$ and some $i_0 \in \mathcal{J}(L_j)$. Hence, there are $\sum_{j=1}^n |\mathcal{J}(L_j)|$
join-irreducible elements in $L$.



4 Derivative of a function over a lattice 

Let $(L, \le)$ be a finite lower locally distributive lattice, and $f : L \to \mathbb{R}$ a real-valued
function on it.

Definition 1 Let $i \in \mathcal{J}(L)$. The derivative of $f$ w.r.t. $i$ at point $x \in L$ is given by:

$$\Delta_i f(x) := f(x \vee i) - f(x).$$

Note that $\Delta_i f(x) = 0$ if $i \le x$. We say that the derivative $\Delta_i f(x)$ is Boolean if $[x, x \vee i]$
is the Boolean lattice $2^1$, in other words if $x \vee i \succ x$. Differentiating twice w.r.t. two
join-irreducible elements $i, j$ such that $i \| j$ ($i$ and $j$ are incomparable) leads to:

$$\Delta_i(\Delta_j f(x)) = \Delta_j(\Delta_i f(x)) = f(x \vee i \vee j) - f(x \vee i) - f(x \vee j) + f(x).$$

We call this quantity the second derivative w.r.t. $i, j$ or the derivative w.r.t. $i \vee j$, denoted
by $\Delta_{i \vee j} f(x)$. Note that allowing $i \le j$ leads to $\Delta_{i \vee j} f(x) = -\Delta_i f(x)$.

Using the minimal decomposition, the derivative w.r.t. any element $y$ can be defined.

Definition 2 Let $x, y \in L$, and $y = \bigvee_{k=1}^n i_k$ be the minimal decomposition of $y$ into
join-irreducible elements. Then the derivative of $f$ w.r.t. $y$ at point $x$ is given by:

$$\Delta_y f(x) = \Delta_{i_1}(\Delta_{i_2}(\cdots \Delta_{i_n} f(x) \cdots)).$$

The derivative is Boolean if $[x, x \vee y]$ is the Boolean lattice $2^n$. The derivative is 0 if for
some $k$, $i_k \le x$. The following lemma gives practical equivalent conditions.

Lemma 1 Let $x, y \in L$.

(i) The derivative $\Delta_y f(x)$ is 0 whenever $\eta(x) \cap \eta^*(y) \ne \emptyset$.
(ii) The derivative $\Delta_y f(x)$ is Boolean iff $\eta(x \vee y) = \eta(x) \,\dot\cup\, \eta^*(y)$ (disjoint union).

Proof: (i) Let $k \in \eta(x) \cap \eta^*(y)$. Since $k \in \eta(x)$, all join-irreducible elements below $k$
are also in $\eta(x)$, hence $\eta(k) \subseteq \eta(x)$. By Th. 1 this is equivalent to $k \le x$, which in turn
implies that the derivative is 0 since $k \in \eta^*(y)$.

(ii) Let us consider first $y = i \in \mathcal{J}(L)$, and suppose $\Delta_i f(x)$ is Boolean. Since $x \vee i \succ x$,
by isomorphism, we have $\eta(x \vee i) \succ \eta(x)$, which means that there exists some $k \in \mathcal{J}(L)$
such that $\eta(x \vee i) = \eta(x) \cup \{k\}$. Since $\eta(x \vee i) = \eta(x) \cup \eta(i)$, $k \in \eta(i)$, and all other
$j \in \eta(i)$ belong also to $\eta(x)$. Hence $k = i = \eta^*(i)$ since $\eta(i) = \{j \in \mathcal{J}(L) \mid j \le i\}$. The
converse is clear. Applying this result recursively proves (ii). ■

As a consequence of (ii), the lattice $[x, x \vee y]$ is isomorphic to $(\mathcal{P}(\eta^*(y)), \subseteq)$.
We express the derivative in terms of the Möbius transform of $f$.

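Proposition 1 Let $i$ be a join-irreducible element such that $\Delta_i f(x)$ is Boolean. We
denote by $m$ the Möbius transform of $f$. Then

$$\Delta_i f(x) = \sum_{y \in [i, x \vee i]} m(y).$$

Proof: We have:

$$\Delta_i f(x) = \sum_{y \le x \vee i} m(y) - \sum_{y \le x} m(y) = \sum_{y \in \downarrow(x \vee i) \setminus \downarrow x} m(y)$$

since $\downarrow x \subseteq \downarrow(x \vee i)$. Using Lemma 1 (ii), we have $\eta(x \vee i) = \eta(x) \cup \{i\}$. Applying (10),
we get

$$\downarrow(x \vee i) \setminus \downarrow x = \{y \mid \eta(y) = \bigcup_{j \in K} j \cup \{i\},\ K \subseteq \eta(x)\} = [i, x \vee i]$$

since we get $i$ for $K = \emptyset$, and $x \vee i$ for $K = \eta(x)$, and the set is clearly an interval. ■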

Based on this, we can show the general result:

Theorem 2 Let $x, y \in L$, such that $\Delta_y f(x)$ is Boolean. Then

$$\Delta_y f(x) = \sum_{z \in [y, x \vee y]} m(z).$$

Proof: We proceed by recurrence on $|\eta^*(y)|$. The result is already shown for $|\eta^*(y)| = 1$.
Let us suppose it holds for some $y$, and consider $y' = y \vee i$, with $i \notin \eta(y)$ and $\Delta_{y'} f(x)$
being Boolean. We have:

$$
\begin{aligned}
\Delta_{y'} f(x) &= \Delta_i(\Delta_y f(x)) \\
&= \Delta_y f(x \vee i) - \Delta_y f(x) \\
&= \sum_{z \in [y, x \vee y \vee i]} m(z) - \sum_{z \in [y, x \vee y]} m(z) \\
&= \sum_{z \in [y, x \vee y \vee i] \setminus [y, x \vee y]} m(z).
\end{aligned}
$$

Since $[y, x \vee y] = \{z \mid \eta(z) = \eta(y) \cup \bigcup_{j \in J} j,\ J \subseteq \eta(x)\}$ and
$[y, x \vee y \vee i] = \{z \mid \eta(z) = \eta(y) \cup \bigcup_{j \in J} j,\ J \subseteq \eta(x) \cup \{i\}\}$, we get

$$[y, x \vee y \vee i] \setminus [y, x \vee y] = \{z \mid \eta(z) = \eta(y) \cup \bigcup_{j \in J} j \cup \{i\},\ J \subseteq \eta(x)\} = [y', y' \vee x],$$

the desired result. ■

The close link between our derivative and the Möbius transform is not surprising since
the Möbius transform already has the meaning of a derivative.
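Theorem 2 can be checked numerically in the simplest setting, the Boolean lattice $2^N$, where $\Delta_S v(T)$ is Boolean for every disjoint pair $S, T$ and the interval $[S, S \cup T]$ consists of the sets $S \cup K$, $K \subseteq T$. The sketch below is a sanity check under these assumptions; the function names and the integer-valued test function `v` are hypothetical.

```python
from itertools import combinations


def subsets(items):
    """List every subset of `items` as a frozenset."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mobius(v, universe):
    """Mobius transform on 2^N: m(A) = sum_{B <= A} (-1)^{|A|-|B|} v(B)."""
    return {A: sum((-1) ** (len(A) - len(B)) * v(B) for B in subsets(A))
            for A in subsets(universe)}


def derivative(v, S, T):
    """Closed form of the iterated derivative Delta_S v(T) on 2^N."""
    return sum((-1) ** (len(S) - len(K)) * v(T | K) for K in subsets(S))


def interval_sum(m, S, T):
    """Sum of m over the interval [S, S u T] (S and T disjoint)."""
    return sum(m[S | K] for K in subsets(T))


N = frozenset({1, 2, 3})


def v(A):
    """Hypothetical integer-valued set function, so equality tests are exact."""
    return len(A) ** 3 + int(1 in A and 2 in A)


m = mobius(v, N)
checks = [(S, T, derivative(v, S, T), interval_sum(m, S, T))
          for S in subsets(N) for T in subsets(N - S)]
print(all(d == s for _, _, d, s in checks))
```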

Let us apply these results to the case of usual capacities and bi-capacities. It suffices
to check if formulas coincide for join-irreducible elements. For capacities, we have for
any $i \in N$, $\Delta_i v(A) := v(A \cup i) - v(A)$, so that we recover the definition above. Note
that this coincides with the notion of derivative for pseudo-Boolean functions [14]. For
bi-capacities, we have

$$
\begin{aligned}
\Delta_{(i,i^c)} v(A, B) &= v(A \cup i, B) - v(A, B) = \Delta_{i,\emptyset} v(A, B) \\
\Delta_{(\emptyset,i^c)} v(A, B) &= v(A, B \setminus i) - v(A, B) = \Delta_{\emptyset,i} v(A, B),
\end{aligned}
$$

which again coincides with the definition given above, although notation differs.

5 Interaction: the general case

As seen in Section 2, the definition of the derivative is the key concept for the interaction
index. Using our general definition of derivative with new notation, let us express the
interaction when $L = 3^n$ using the notation $\Delta_{(S,T)}$. Imposing the same argument to $I$
and $\Delta$, we are led to:

$$I(S, T) = \sum_{K \subseteq T} \frac{(t-k)!\, k!}{(t+1)!} \, \Delta_{(S,T)} v(K, N \setminus (K \cup S)), \qquad (12)$$

with the correspondence $I_{S,T} = I(S, (S \cup T)^c)$. Observe that these two notations precisely
correspond to those introduced in Eq. (2).

We remark that the derivative in the above expression is taken over some vertices of
$\mathcal{Q}(N \setminus S)$. Also, the importance index corresponds to derivatives w.r.t. join-irreducible
elements. Based on these observations, we are now in position to propose a definition
using our general framework (see Section 3). Roughly speaking, the interaction index
w.r.t. $x \in L$ is a weighted average of the derivative w.r.t. $x$, taken at vertices of $L$
"not related" to $x$. The weights can be determined recursively from the cases where $x$
is a join-irreducible element, and the coefficients for these cases are determined by some
normalization condition (e.g. efficiency-like condition in the case of the Shapley index).

5.1 Definition of interaction 

We begin by defining the importance index, i.e. interaction index w.r.t. a join-irreducible
element.

Definition 3 Let $i = (\bot_1, \ldots, \bot_{j-1}, i_0, \bot_{j+1}, \ldots, \bot_n)$ be a join-irreducible element of $L$.
The interaction w.r.t. $i$ of $v$ is any function of the form

$$I(i) := \sum_{x \in \Gamma(\prod_{k=1}^{j-1} L_k) \times \{\underline{i}_0\} \times \Gamma(\prod_{k=j+1}^{n} L_k)} \alpha^1_{h(x)} \, \Delta_i v(x), \qquad (13)$$

where $\underline{i}_0$ is the (unique) predecessor of $i_0$ in $L_j$, $h(x)$ is the number of components of $x$
equal to $\top_l$, $l = 1, \ldots, n$, and $\alpha^1_k \in \mathbb{R}$ for any integer $k$.

Observe that the constants $\alpha^1_{h(x)}$ do not depend on $i$. Also, the derivative is Boolean.

Let us show that this definition encompasses the case of capacities and bi-capacities.
For capacities, $L_k = \{0, 1\}$ for all $k$, with 1 as unique join-irreducible element,
join-irreducible elements of $L = 2^n$ are singletons, all elements in $L$ are vertices, and $h(x)$ is
the cardinality of sets. Thus we get for a singleton $j \in N$:

$$I(j) = \sum_{A \subseteq N \setminus j} \alpha^1_{|A|} \, [v(A \cup j) - v(A)]$$

as desired. For bi-capacities, $L_k = \{-1, 0, 1\}$ for all $k$, with $\mathcal{J}(L_k) = \{0, 1\}$. The height
function is $h(A, B) = |A|$. Let us consider first the case where the join-irreducible element
in $L = 3^n$ is $(j, j^c)$, or in vector form $(-1, \ldots, -1, 1, -1, \ldots, -1)$, where 1 is at the $j$th
place. Then $\Gamma(3^{j-1}) \times \{0\} \times \Gamma(3^{n-j})$ corresponds to vertices of $\mathcal{Q}(N \setminus j)$. Thus we obtain

$$I(j, j^c) = \sum_{A \subseteq N \setminus j} \alpha^1_{|A|} \, \Delta_{(j,j^c)} v(A, N \setminus (A \cup j))$$

which has the required form. Let us examine now the case of $(\emptyset, j^c)$, which is, in vector
form, $(-1, \ldots, -1, 0, -1, \ldots, -1)$. This time $\Gamma(3^{j-1}) \times \{-1\} \times \Gamma(3^{n-j})$ is $\Gamma(L)$, after
removal of vertices $(A, A^c)$ with $j \in A$. In summary, we obtain:

$$I(\emptyset, j^c) = \sum_{A \subseteq N \setminus j} \alpha^1_{|A|} \, \Delta_{(\emptyset,j^c)} v(A, N \setminus A)$$

which has again the required form.

Let us generalize Def. 3 to a class of elements of $L$ denoted by $\tilde{L}$ and defined as
follows: $\tilde{L} := \bigcup_{K \subseteq N} L_K$, with

$$L_K := \{x \in L \mid \forall k \in K,\ \exists!\, \underline{i}_k \in L_k \text{ such that } \forall i \in \eta^*(x_k),\ i \succ \underline{i}_k, \text{ and } x_k = \bot_k \text{ if } k \in N \setminus K\}.$$

In words, it is the set of elements whose coordinates are either bottom or such that the
minimal decomposition covers a unique element. Observe that for the case where $L_k$ is a
linear lattice or an atomistic one (i.e. practical cases of interest), $\tilde{L} = L$.

Definition 4 Let $K \subseteq N$, $x \in L_K$, and denote as above by $\underline{i}_k$, for all $k \in K$, the element
covered by all $i \in \eta^*(x_k)$. The interaction w.r.t. $x$ of $v$ is any function of the form

$$I(x) := \sum_{y \,\mid\, y_k = \top_k \text{ or } \bot_k \text{ if } k \notin K,\ y_k = \underline{i}_k \text{ else}} \alpha^{|K|}_{h(y)} \, \Delta_x v(y) \qquad (14)$$

where $h(y)$ is the number of components of $y$ equal to $\top_l$, $l = 1, \ldots, n$.

The derivative is Boolean if in addition the $L_k$'s are modular (and hence distributive),
by application of the following Lemma.

Lemma 2 If $L_k$ is distributive, $k = 1, \ldots, n$, then for any $K \subseteq N$, any $x \in L_K$, $\Delta_x v(y)$
is Boolean for any $y$ such that $y_k = \top_k$ or $\bot_k$, $k \notin K$, and $y_k = \underline{i}_k$, where $\underline{i}_k$ is the
element covered by all $i \in \eta^*(x_k)$.

Proof: We have to prove that $[y, x \vee y]$ is isomorphic to $2^{|\eta^*(x)|}$, with $y$ defined as above.
It suffices to prove that $[y_k, x_k \vee y_k]$ is isomorphic to $2^{|\eta^*(x_k)|}$ for each coordinate $k$. If
$k \notin K$, then $[y_k, x_k \vee y_k] = \{y_k\} \cong 2^0$. If $k \in K$, then $y_k$ is covered by all $i$ in $\eta^*(x_k)$.
Hence $[y_k, x_k \vee y_k] = [\underline{i}_k, x_k]$, which is atomistic. Since it is also distributive, it is Boolean
and isomorphic to $2^{|\eta^*(x_k)|}$. ■

5.2 Expression with the Möbius transform and efficiency

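Let us express $I(x)$ w.r.t. the Möbius transform. First we recall the result for bi-capacities,
which reads P| [TT]:

$$I(S, T) = \sum_{(S',T') \in [(S,T), (S \cup T, \emptyset)]} \frac{1}{t - t' + 1} \, m(S', T').$$

We have the following general result.

Theorem 3 Let $K \subseteq N$, and assume distributivity holds for every $L_k$, $k \in K$. The
expression of the interaction index for $x \in L_K$ in terms of the Möbius transform is given
by:

$$I(x) = \sum_{z \in [x, \hat{x}]} \beta^{|K|}_{k(z)} \, m(z),$$

with $\hat{x}_k := \top_k$ for $k \notin K$, $\hat{x}_k = x_k$ else, and $k(z)$ is the number of coordinates of $z$ not
equal to $\bot_l$, $l = 1, \ldots, n$. Moreover, the real constants $\beta^{|K|}_{k(z)}$ are related to the $\alpha^{|K|}_{h(y)}$'s by:

$$\beta^{|K|}_{k(z)} = \sum_{l=0}^{n-k(z)} \binom{n-k(z)}{l} \, \alpha^{|K|}_{k(z)-|K|+l}. \qquad (15)$$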

Proof: Since the derivative is Boolean by Lemma 2, we can apply Th. 2, and we get:

$$I(x) = \sum_{y \,\mid\, y_k = \top_k \text{ or } \bot_k \text{ if } k \notin K,\ y_k = \underline{i}_k \text{ else}} \alpha^{|K|}_{h(y)} \sum_{z \in [x,\, y \vee x]} m(z). \qquad (16)$$

Then for any $y$ such that $y_k = \top_k$ or $\bot_k$ if $k \notin K$, and $y_k = \underline{i}_k$ else,
$(y \vee x)_k = x_k$ when $k \in K$, other coordinates being $\top_k$ or $\bot_k$, in any combination.
Hence for all possible such $y$, $z$ takes any value in $[x, (\top_{N \setminus K}, x_K)]$, where
$(\top_{N \setminus K}, x_K)$ has coordinate $\top_k$ when $k \notin K$, and $x_k$ else. Denoting by $\check{x}$ the
right bound of this interval, we get

$$I(x) = \sum_{z \in [x,\check{x}]} \beta_z\, m(z).$$

It remains to express $\beta_z$ in terms of $\alpha^{|K|}_{h(y)}$. Let us take a fixed $z \in [x,\check{x}]$ and
examine for which $y$'s in (16) it belongs to $[x, y \vee x]$. Note that $z_k = x_k$ for all $k \in K$.
Since $y_l$, $l \notin K$, is either $\bot_l$ or $\top_l$, we must have $y_l = \top_l$ whenever $z_l \neq \bot_l$,
the other coordinates not in $K$ being free. The result is then:

$$\beta_z = \sum_{y \,\mid\, y_l = \top_l \text{ if } z_l \neq \bot_l,\ l \notin K} \alpha^{|K|}_{h(y)}.$$

Denoting by $k(z)$ the number of coordinates not equal to $\bot_l$, we get

$$\beta_z = \sum_{l=0}^{n-k(z)} \binom{n-k(z)}{l}\, \alpha^{|K|}_{k(z)-|K|+l}.$$

Remarking that $\beta_z$ depends only on $k(z)$ and $|K|$, we get the desired result. ■
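
The counting step that closes the proof can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the function names `beta_by_enumeration` and `beta_by_formula` are hypothetical, the $\alpha$ values are arbitrary test numbers, and $z$ is represented only by the pattern of its non-bottom coordinates outside $K$ (its coordinates in $K$ are assumed to be non-bottom, as in the proof). It enumerates every admissible $y$ and compares the resulting sum of $\alpha$ coefficients with the binomial expression.

```python
from fractions import Fraction
from itertools import product
from math import comb

def beta_by_enumeration(n, size_K, nonbottom_outside, alpha):
    """Sum alpha[h(y)] over all y in {bottom, top} on the coordinates outside K,
    keeping only the y with y_l = top wherever z_l is not bottom."""
    total = Fraction(0)
    for y in product([0, 1], repeat=n - size_K):  # 1 encodes top, 0 encodes bottom
        if all(y_l == 1 for y_l, nb in zip(y, nonbottom_outside) if nb):
            total += alpha[sum(y)]  # h(y) = number of coordinates equal to top
    return total

def beta_by_formula(n, size_K, k_z, alpha):
    """Binomial form: sum_{l=0}^{n-k(z)} C(n-k(z), l) * alpha[k(z) - |K| + l]."""
    return sum(comb(n - k_z, l) * alpha[k_z - size_K + l]
               for l in range(n - k_z + 1))

n, size_K = 5, 2
alpha = [Fraction(1, h + 2) for h in range(n - size_K + 1)]  # arbitrary test values
nonbottom_outside = [True, False, False]  # z has one non-bottom coordinate outside K
k_z = size_K + sum(nonbottom_outside)     # coordinates of z in K are never bottom
assert beta_by_enumeration(n, size_K, nonbottom_outside, alpha) == \
       beta_by_formula(n, size_K, k_z, alpha)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is an equality test rather than a floating-point tolerance check.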

Let us check if we recover the coefficients for bi-capacities and Shapley index. For
$(S,T) = (i,i^c)$ and $(\emptyset,i^c)$, we have $\beta_{(S',T')} = \frac{1}{n-t'}$. We apply (15), noting
that $(S',T')$ has $n - t'$ coordinates different from bottom:

$$\beta_{(S',T')} = \sum_{l=0}^{t'} \binom{t'}{l}\, \alpha^1_{n-t'-1+l} = \sum_{l=0}^{t'} \binom{t'}{l}\, \frac{(n-t'-1+l)!\,(t'-l)!}{n!}.$$

In [11], the following combinatorial result was shown:

$$\sum_{i=0}^{k} \frac{(n-i-1)!\,k!}{n!\,(k-i)!} = \frac{1}{n-k}.$$

Applying the above formula with $i = t' - l$, we get the desired result.
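
As a sanity check, the two sums above can be evaluated exactly for small $n$. This is a minimal sketch, not part of the original argument: the helper names are hypothetical, the Shapley coefficients are taken as $\alpha^1_h = (n-1-h)!\,h!/n!$, and the combinatorial sum is assumed to run over $i = 0, \dots, k$.

```python
from fractions import Fraction
from math import comb, factorial

def shapley_alpha(n, h):
    """Shapley coefficient alpha^1_h = (n-1-h)! h! / n!."""
    return Fraction(factorial(n - 1 - h) * factorial(h), factorial(n))

def beta_bicapacity(n, t_prime):
    """Relation between beta and alpha with |K| = 1 and k(z) = n - t'."""
    return sum(comb(t_prime, l) * shapley_alpha(n, n - t_prime - 1 + l)
               for l in range(t_prime + 1))

def combinatorial_sum(n, k):
    """Left-hand side of the identity: sum_{i=0}^{k} (n-i-1)! k! / (n! (k-i)!)."""
    return sum(Fraction(factorial(n - i - 1) * factorial(k),
                        factorial(n) * factorial(k - i))
               for i in range(k + 1))

for n in range(1, 8):
    for t_prime in range(n):  # t' ranges over 0, ..., n-1
        assert combinatorial_sum(n, t_prime) == Fraction(1, n - t_prime)
        assert beta_bicapacity(n, t_prime) == Fraction(1, n - t_prime)
```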

It is possible to find easily the $\beta^1_k$ coefficients if we consider a normalization condition
as for the Shapley index. Let us define efficiency as

$$\sum_{i \in \mathcal{J}(L)} I(i) = v(\top) - v(\bot), \qquad (17)$$

and call Shapley interaction index the resulting interaction index. Applying Th. 3, we
get:

$$\sum_{i \in \mathcal{J}(L)} I(i) = \sum_{i \in \mathcal{J}(L)} \sum_{z \in [i,\check{i}]} \beta^1_{k(z)}\, m(z).$$

Let us take $m$ such that it is nonzero only for a given $z \in L$, say $z_0$, such that for all
coordinates $z_l$ different from bottom, we have $z_l \in \mathcal{J}(L_l)$. Since $\sum_{z \in L} m(z) = v(\top)$,
we have necessarily $m(z_0) = v(\top) - v(\bot)$. Observe that $z_0$ belongs to all intervals $[i,\check{i}]$
such that $z_0 \geq i$ and $z_0 \leq \check{i}$. Recalling that $i = (\bot_1, \dots, \bot_{j-1}, i_j, \bot_{j+1}, \dots, \bot_n)$ and
$\check{i} = (\top_1, \dots, \top_{j-1}, i_j, \top_{j+1}, \dots, \top_n)$, if $z_0$ has only one coordinate $z_l \neq \bot_l$, then
only $i$ such that $i_l = z_l$ is suitable. More generally, if $z_0$ has only $k$ coordinates different
from bottom, then we have only $k$ choices for $i$. Hence, for such $z_0$,

$$\beta^1_{k(z_0)} = \frac{v(\top) - v(\bot)}{k(z_0)\,[v(\top) - v(\bot)]} = \frac{1}{k(z_0)}.$$
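
For the classical case $L = 2^n$ the efficiency condition can be verified numerically. The sketch below is an illustration only: it assumes the standard Möbius transform on the Boolean lattice, identifies the join-irreducible elements with singletons, and uses $\beta^1_k = 1/k$; the function names and the random test function are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
import random

def subsets(universe):
    """Yield all subsets of `universe` as frozensets."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def mobius(v, N):
    """Mobius transform on 2^N: m(A) = sum over B in A of (-1)^(|A|-|B|) v(B)."""
    return {A: sum((-1) ** (len(A) - len(B)) * v[B] for B in subsets(A))
            for A in subsets(N)}

def index_from_mobius(m, i):
    """I(i) = sum over z containing i of beta_{k(z)} m(z), with beta_k = 1/k."""
    return sum(Fraction(1, len(A)) * val for A, val in m.items() if i in A)

random.seed(0)
N = frozenset(range(4))
v = {A: Fraction(random.randint(0, 10)) for A in subsets(N)}  # arbitrary test function
m = mobius(v, N)
total = sum(index_from_mobius(m, i) for i in N)
assert total == v[N] - v[frozenset()]  # efficiency: sum of I(i) = v(top) - v(bottom)
```

Each nonempty set $A$ contributes $m(A)/|A|$ to exactly $|A|$ indices, which is the counting argument used above to obtain $1/k(z_0)$.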
Let us apply this to the Shapley index for bi-capacities. We get: 
Suppose the $\beta^1_{k(z)}$ are determined by some rule, as above. Since $k(z)$ takes values in
$\{1, \dots, n\}$, there are $n$ coefficients $\beta^1_{k(z)}$, while for $\alpha^1_{h(y)}$,
$h(y) \in \{0, \dots, n-1\}$, so that there are also $n$ coefficients. Th. 3 tells us that
$\alpha^1_0, \dots, \alpha^1_{n-1}$ can be computed from $\beta^1_1, \dots, \beta^1_n$ by solving the triangular
linear system (15). Since there is no 0 on the diagonal of the matrix, there is always a unique
solution to this system. Applying this observation to the Shapley interaction index, we get the
following result.

Theorem 4 When $L_k$ is distributive, for all $k = 1, \dots, n$, the coefficients $\alpha^1_0, \dots, \alpha^1_{n-1}$
of the Shapley interaction index $I(i)$, $i \in \mathcal{J}(L)$ (i.e. satisfying Def. 4 and (17)) are given
by:

$$\alpha^1_k = \frac{(n-1-k)!\,k!}{n!}.$$
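
The triangular system can be solved by back-substitution, which gives a direct numerical check of Theorem 4. This is a sketch under the assumption that the system reads $\beta^1_k = \sum_{l=0}^{n-k} \binom{n-k}{l} \alpha^1_{k-1+l}$ for $k = 1, \dots, n$, with $\beta^1_k = 1/k$; the function name `solve_alpha` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def solve_alpha(n, beta):
    """Back-substitution for beta[k] = sum_{l=0}^{n-k} C(n-k, l) * alpha[k-1+l],
    k = 1..n (case |K| = 1). `beta` is a dict {k: value}; returns alpha[0..n-1]."""
    alpha = [None] * n
    for k in range(n, 0, -1):           # alpha[n-1] first, then downwards
        known = sum(comb(n - k, l) * alpha[k - 1 + l] for l in range(1, n - k + 1))
        alpha[k - 1] = beta[k] - known  # diagonal coefficient C(n-k, 0) equals 1
    return alpha

n = 6
beta = {k: Fraction(1, k) for k in range(1, n + 1)}
alpha = solve_alpha(n, beta)
expected = [Fraction(factorial(n - 1 - k) * factorial(k), factorial(n)) for k in range(n)]
assert alpha == expected
```

The diagonal entries are all equal to 1, which is why the solution always exists and is unique.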



5.3 The recursion axiom for the linear case

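Let us generalize the recursion axiom (6) to compute $I(x)$, with the following additional
restriction: all $L_k$'s are linear lattices. Hence, all previous results apply. Also, all
derivatives involved are Boolean.

Let $J \subseteq N$, and consider $x$ such that $x_k = \bot_k$ if $k \notin J$, and $x_k = i_k$ else, for some
$i_k \in \mathcal{J}(L_k)$. We denote as before by $\underline{i}_k$ the unique predecessor of $i_k$. We introduce
additional notations. For any $K \subset J$, $K \neq \emptyset, J$, the function $v$ restricted to
$\prod_{k \in N \setminus K} L_k$ is denoted by $v_x^{N \setminus K}$ and defined by:

$$v_x^{N \setminus K}(y) := v(y'), \text{ with } y'_l := \begin{cases} y_l, & \text{if } l \notin K \\ \underline{i}_l, & \text{if } l \in K, \end{cases} \quad \forall y \in \prod_{k \in N \setminus K} L_k.$$

The function $v$ reduced to $x$ is a function $v^{[x]}$ defined on $\prod_{k \in N \setminus J} L_k \times \{\bot_{[x]}, \top_{[x]}\}$ by:

$$v^{[x]}(y) := v(\phi_{[x]}(y)), \quad \forall y \in \prod_{k \in N \setminus J} L_k \times \{\bot_{[x]}, \top_{[x]}\},$$

and $\phi_{[x]} : \prod_{k \in N \setminus J} L_k \times \{\bot_{[x]}, \top_{[x]}\} \to L$ is defined by

$$(\phi_{[x]}(y))_k := \begin{cases} i_k, & \text{if } k \in J \text{ and } y_{[x]} = \top_{[x]} \\ \underline{i}_k, & \text{if } k \in J \text{ and } y_{[x]} = \bot_{[x]} \\ y_k, & \text{if } k \notin J. \end{cases}$$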

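We propose the following recursion formula:

$$I^{v}(x) = I^{v^{[x]}}(\bot_{N \setminus J}, \top_{[x]}) - \sum_{K \subset J,\ K \neq \emptyset, J} I^{v_x^{N \setminus K}}(x_{|N \setminus K}), \qquad (18)$$

where $\bot_{N \setminus J}$ stands for the vector $(\bot_k)_{k \in N \setminus J}$, and $x_{|N \setminus K}$ is the restriction of $x$
to coordinates in $N \setminus K$.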

Let us check if we recover (6) for capacities. Taking $S \subseteq N$, the restricted game $v_S^{N \setminus K}$
for $\emptyset \neq K \subset S$ is defined by $v_S^{N \setminus K}(T) = v(T)$ if $T \subseteq N \setminus K$, and does not depend
on $S$. The reduced game is defined over $N \setminus S \cup \{[S]\}$, and $\phi(T) = T$ if $T \not\ni [S]$, and
$T \setminus \{[S]\} \cup S$ else. Now observe that $(\bot_{N \setminus J}, \top_{[x]})$ writes $[S]$ in our case, so that the
formula is recovered.

The following result holds. 

Theorem 5 Denoting by $\alpha^j_k(n)$ the coefficients $\alpha^j_k$ involved in (14), the recursion
formula (18) induces the following recursive relation:

$$\alpha^j_k(n) = \alpha^1_k(n-j+1), \quad \forall k = 0, \dots, n-j,\ \forall j = 1, \dots, n. \qquad (19)$$



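Proof: We prove the result by recurrence on $j := |J|$. It is obviously true for $j = 1$, and
let us assume it is true up to $j - 1$. Simplifying notations, the left term in (18) writes:

$$\sum_{\substack{y_{N \setminus J} \in \Gamma(\prod_{k \in N \setminus J} L_k) \\ y_J = \underline{i}_J}} \alpha^j_{h(y)}(n)\, \Delta_x v(y) = \sum_{y_{N \setminus J} \in \Gamma(\prod_{k \in N \setminus J} L_k)} \alpha^j_{h(y_{N \setminus J})}(n)\, \Delta_x v(y_{N \setminus J}, \underline{i}_J)$$

where $y_A$ indicates the vector $y$ restricted to coordinates in $A$, and $\underline{i}_J$ is the vector with
coordinates $\underline{i}_k$, $k \in J$. Using similar notations, the right term writes:

$$\sum_{\substack{y_{N \setminus J} \in \Gamma(\prod_{k \in N \setminus J} L_k) \\ y_{[x]} = \bot_{[x]}}} \alpha^1_{h(y)}(n-j+1)\, \Delta_{\top_{[x]}} v^{[x]}(y) - \sum_{\emptyset \neq K \subset J} \sum_{\substack{y_{N \setminus J} \in \Gamma(\prod_{k \in N \setminus J} L_k) \\ y_{J \setminus K} = \underline{i}_{J \setminus K}}} \alpha^{j-k}_{h(y)}(n-k)\, \Delta_{x_{|N \setminus K}} v_x^{N \setminus K}(y)$$

$$= \sum_{y_{N \setminus J} \in \Gamma(\prod_{k \in N \setminus J} L_k)} \alpha^1_{h(y_{N \setminus J})}(n-j+1) \Big[ \Delta_{\top_{[x]}} v^{[x]}(y_{N \setminus J}, \bot_{[x]}) - \sum_{\emptyset \neq K \subset J} \Delta_{x_{|N \setminus K}} v_x^{N \setminus K}(y_{N \setminus J}, \underline{i}_{J \setminus K}) \Big]$$

where equality comes from the recurrence hypothesis, since with $k := |K|$ we have
$\alpha^{j-k}_{h}(n-k) = \alpha^1_{h}(n-j+1)$. Hence, Eq. (18) is equivalent to:

$$\sum_{y_{N \setminus J} \in \Gamma(\prod_{k \in N \setminus J} L_k)} \Big[ \alpha^j_{h(y_{N \setminus J})}(n)\, \Delta_x v(y_{N \setminus J}, \underline{i}_J) - \alpha^1_{h(y_{N \setminus J})}(n-j+1)\, \Delta_{\top_{[x]}} v^{[x]}(y_{N \setminus J}, \bot_{[x]}) + \alpha^1_{h(y_{N \setminus J})}(n-j+1) \sum_{\emptyset \neq K \subset J} \Delta_{x_{|N \setminus K}} v_x^{N \setminus K}(y_{N \setminus J}, \underline{i}_{J \setminus K}) \Big] = 0.$$

Since the equality holds for any $v$, we should have for any $y_0 \in \Gamma(\prod_{k \in N \setminus J} L_k)$:

$$\alpha^j_{h(y_0)}(n)\, \Delta_x v(y_0, \underline{i}_J) = \alpha^1_{h(y_0)}(n-j+1) \Big[ \Delta_{\top_{[x]}} v^{[x]}(y_0, \bot_{[x]}) - \sum_{\emptyset \neq K \subset J} \Delta_{x_{|N \setminus K}} v_x^{N \setminus K}(y_0, \underline{i}_{J \setminus K}) \Big].$$

We are done if we prove that

$$0 = \Delta_x v(y_0, \underline{i}_J) - \Delta_{\top_{[x]}} v^{[x]}(y_0, \bot_{[x]}) + \sum_{\emptyset \neq K \subset J} \Delta_{x_{|N \setminus K}} v_x^{N \setminus K}(y_0, \underline{i}_{J \setminus K}). \qquad (20)$$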
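The derivative $\Delta_x v(y_0, \underline{i}_J)$ is the sum of terms $\pm v(z)$, with $z_j = i_j$ or $z_j = \underline{i}_j$ whenever
$j \in J$. We may assume w.l.o.g. that $J = \{1, \dots, j\}$. We associate to each such $z$ a set
$K \subseteq J$ containing the coordinates where $z_j = i_j$, and denote with some abuse of notation
$v(z)$ by $v(K)$. Hence $\Delta_x v(y_0, \underline{i}_J)$ can be represented by the sum:

$$v(J) - v(J \setminus \{1\}) - v(J \setminus \{2\}) - \cdots + v(J \setminus \{1,2\}) + \cdots + (-1)^{|K|} v(J \setminus K) + \cdots + (-1)^{|J|} v(\emptyset) = \sum_{K \subseteq J} (-1)^{|K|} v(J \setminus K).$$

Similarly, we have $\Delta_{\top_{[x]}} v^{[x]}(y_0, \bot_{[x]}) = v(J) - v(\emptyset)$ by definition of $v^{[x]}$, and

$$\Delta_{x_{|N \setminus K}} v_x^{N \setminus K}(y_0, \underline{i}_{J \setminus K}) = \sum_{L \subseteq J \setminus K} (-1)^{|L|} v(J \setminus (K \cup L)).$$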

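Using the last 2 expressions, the right side of (20) writes:

$$\sum_{K \subseteq J} (-1)^{|K|} v(J \setminus K) - v(J) + v(\emptyset) + \sum_{\emptyset \neq K \subset J} \sum_{L \subseteq J \setminus K} (-1)^{|L|} v(J \setminus (K \cup L))$$

$$= \sum_{K \subseteq J} \sum_{L \subseteq J \setminus K} (-1)^{|L|} v(J \setminus (K \cup L)) - v(J)$$

$$= \sum_{K' \subseteq J} v(J \setminus K') \sum_{k=0}^{k'} \binom{k'}{k} (-1)^{k'-k} - v(J)$$

$$= 0.$$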
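
The inclusion-exclusion identity behind this computation can be tested numerically on a random function. The sketch below is illustrative and rests on the abuse of notation used above, where $v$ is indexed by the subset of $J$ whose coordinates sit at $i_j$; the function names are hypothetical and $j = 4$ is an arbitrary choice.

```python
from itertools import combinations
import random

def subsets(items):
    """Yield all subsets of `items` as frozensets."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def derivative(v, J, K):
    """Sum over L in J minus K of (-1)^|L| v(J minus (K union L)).
    With K empty this is the alternating sum representing Delta_x v."""
    return sum((-1) ** len(L) * v[J - K - L] for L in subsets(J - K))

random.seed(1)
J = frozenset(range(4))
v = {A: random.randint(-20, 20) for A in subsets(J)}  # arbitrary test function

lhs = derivative(v, J, frozenset())
proper = [K for K in subsets(J) if K and K != J]       # K nonempty and different from J
rhs = (v[J] - v[frozenset()]) - sum(derivative(v, J, K) for K in proper)
assert lhs == rhs
```

The check relies on the fact that summing `derivative(v, J, K)` over all $K \subseteq J$ leaves only $v(J)$, which is the cancellation used in the last step of the proof.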



Note that $\alpha^j_k(n)$ depends only on $k$ and $n - j$.

Using (19), we are now able to give the coefficients for the interaction index, which
coincide with those of (IT^ : 

$$\alpha^j_k = \frac{(n-j-k)!\,k!}{(n-j+1)!}. \quad ■$$
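
A short exact computation illustrates how these coefficients behave. The sketch below, with the hypothetical helper `alpha`, checks the recursive relation for small $n$ and also checks that the weights add up to 1 when summed over all combinations of extreme values outside $J$ (there are $\binom{n-j}{k}$ such combinations with $h(y) = k$), which is what makes the index a weighted average.

```python
from fractions import Fraction
from math import comb, factorial

def alpha(j, k, n):
    """alpha^j_k(n) = (n-j-k)! k! / (n-j+1)!, for 0 <= k <= n-j."""
    return Fraction(factorial(n - j - k) * factorial(k), factorial(n - j + 1))

for n in range(1, 8):
    for j in range(1, n + 1):
        for k in range(n - j + 1):
            # recursive relation: alpha^j_k(n) = alpha^1_k(n - j + 1)
            assert alpha(j, k, n) == alpha(1, k, n - j + 1)
        # weights over all vertices y (C(n-j, k) of them with h(y) = k) sum to 1
        assert sum(comb(n - j, k) * alpha(j, k, n) for k in range(n - j + 1)) == 1
```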

6 Concluding remarks 

We end the paper by giving some interpretation of our definition of interaction, and
indicate perspectives.

Taking a particular combination of reference levels for dimensions in $K \subseteq N$, denoted
by $x$ in Def. 4, we compute the "difference with alternate signs" between the value of
the function $v$ at this point $x$ and at the point $\underline{x}$, which is the combination of levels obtained
by just removing one after the other the join-irreducible elements composing $x$. Now,
for dimensions outside $K$, we consider only the combinations of extreme values $\bot_k, \top_k$,
$k \notin K$, instead of all possible combinations of reference levels, which would have been
far too complicated. The interaction index $I(x)$ is just the weighted average of all
these "differences with alternate signs" between $x$ and $\underline{x}$, computed over all possible
combinations of $\bot_k, \top_k$, for $k \notin K$. In our opinion, this is the simplest possible way
to define it, encompassing the classical cases of $L = 2^n$ and $3^n$. Observe however that our
definition cannot be applied to all $x \in L$, but only to the restricted set of elements defined
in Sec. 5. This restriction seems however of little effect, since it does not concern linear or
atomistic lattices (which include, e.g., Boolean lattices and the partition lattice), the most
useful cases in practice.

Results on the particular form of $\alpha^j_k$ remain simple and identical to the classical cases
whenever the $L_k$'s are distributive, since in this case derivatives become Boolean, hence
the underlying structure of computation is identical to the classical case $L = 2^n$. For
other cases, specific computations have to be done.

Lastly, the recursion axiom permits deriving all coefficients $\alpha^j_k$ from the $\alpha^1_k$'s, provided
all $L_k$'s are linear. A direction for further research would be to propose a more general
formula, which however seems difficult at first sight.